ABSTRACT


This  technical  note  describes  a  region-based  data  structure  that  is 
easily  obtained,  lends  itself  to  description  of  the  data  in  a  rich  manner 
by  a  process  of  pointer  reduction,  and  reduces  combinatorial  types  of 
search  by  implicitly  including  positional  information.  The  structure  is 
also  context-free  and  closed. 
I INTRODUCTION AND PRELIMINARY DEFINITIONS

Considerable attention has been paid to developing a structure to
describe regions and boundaries of a picture. To what extent should the
topological or positional information be made explicit,[1]* or rather be
easily retrieved from the structure?[2] Should a special coding system be
developed for that purpose,[3-7] or should the current facilities of
list-processing languages be used? Our aim was to develop a structure that
could be used for picture analysis and thus could be embedded in a
scene-description program. It was important that topological information
be easily retrieved, even though it is subject to change as the picture is
processed by grouping regions. The picture representation described below
is, therefore, oriented toward providing this positional information in a
flexible, yet immediately usable, form.

Initially, suppose the data is represented by an N x N matrix (array)
P = (P(I,J)) of data points P(I,J) (picture points), and there is some
kind of description function or property function

$$\mathcal{D} : P \to \prod_{I} D(I)$$

$$P(I,J) \mapsto (\text{RED}, \text{LIGHT}, \text{FAR}, \ldots)$$

where each D(I) is a description space such as D(1) = color,
D(2) = intensity, D(3) = range, etc.

The purpose here is to describe a structure that supports a rich, yet
economical, description of the contents of the data by collecting like
data into classes. The structure is also constructed so that both the
original data and the data of the structure itself are easy to
manipulate. The natural way to collect like data is to partition the
array into equivalence classes based on the description function
$\mathcal{D}$.


* 

References  are  listed  at  the  end  of  the  body  of  this  technical  note. 
One such meaningful partition would be the partition of matrix P
by the relation

$$P(I,J) \sim P(K,L) \iff \mathcal{D}(P(I,J)) = \mathcal{D}(P(K,L)) \text{ and } C(P(I,J), P(K,L)) \qquad (1)$$

where C is the predicate

(C(X,Y) = T) IFF (X IS CONNECTED TO Y)

where "connected" means the existence of a sequence of points

$$P(I,J) = S_1, S_2, \ldots, S_N = P(K,L)$$

where for each j = 1, ..., N, $S_j \sim P(I,J)$, and $S_j$, $S_{j-1}$
are four-neighbors.[8]

This partitions P into connected homogeneous sets, which we shall call
regions.

While other partitions could be used, this partition has the
advantage that no information is lost, in the sense that the elements
gathered together are not differentiable, and certain topological
considerations, described below, are preserved in an easily manageable
form.

This  first  partition  begins  the  formation  of  the  data  structure, 
forming  the  basic  regions,  called  the  elementary  regions.  The  structure 
then  encompasses  the  capability  of  conveniently  joining  regions  in  such 
a  way  as  to  form  a  description  tree  (as  in  Fig.  1)  where  each  node  is 
determined  by  a  region,  and  each  level  is  a  new  partition  of  the  matrix. 

II  THE  STRUCTURE  ARCS  (A  REGION-ORIENTED  STRUCTURE) 

The matrix P = (P(I,J)) which provides the initial problem can be
viewed as in Fig. 2; that is, each point P(I,J) is surrounded by four
vectors that form a counterclockwise-oriented boundary of the point.
Formally, then, the boundary (denoted by $\partial$) of the point may be
written as

$$\partial(P(I,J)) = V_1(P(I,J)) + V_2(P(I,J)) + V_3(P(I,J)) + V_4(P(I,J)),$$

where the $V_k(P(I,J))$ are the vectors of Fig. 3.
FIGURE 1 A SIMPLIFIED TREE DESCRIPTION OF THE IDEAL CUBE SHOWN

FIGURE 2 TWO POINTS OF A PICTURE MATRIX WITH THE SURROUNDING
VECTORS SHOWN

FIGURE 3 A MATRIX POINT WITH THE FOUR SURROUNDING VECTORS OF ITS
BOUNDARY LABELED AS IN THE TEXT


Now, subject to the relations

$$V_2(P(I,J)) = -V_4(P(I+1,J))$$

$$V_3(P(I,J)) = -V_1(P(I,J+1))$$

it is apparent that the boundary of any set of points $X = \{P(I,J)\}$ is

$$\partial(X) = \sum_{P(I,J) \in X} \partial(P(I,J))$$

and, in general, the boundary of any collection $\{X_n\}$ of disjoint
sets is

$$\partial\Bigl(\bigcup_n X_n\Bigr) = \sum_n \partial(X_n)$$

In  fact,  this  representation,  though  only  instructive,  is  the  formal 
background  of  what  follows. 
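
The cancellation that makes the boundary of a set equal to the sum of the
point boundaries can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function names and the corner
layout (pixel (i, j) occupying the unit square from (i, j) to (i+1, j+1))
are hypothetical and are not taken from the note or from the SRI
implementation.

```python
from collections import Counter

def point_boundary(i, j):
    """Four counterclockwise unit vectors around pixel (i, j).

    Assumed layout (hypothetical): the pixel occupies the unit square with
    corners (i, j) and (i+1, j+1); each vector is a (start, end) pair.
    """
    a, b, c, d = (i, j), (i + 1, j), (i + 1, j + 1), (i, j + 1)
    return [(a, b), (b, c), (c, d), (d, a)]

def boundary(points):
    """Sum the point boundaries; a vector and its negative cancel."""
    total = Counter()
    for (i, j) in points:
        for start, end in point_boundary(i, j):
            if total[(end, start)] > 0:       # the opposite vector is present
                total[(end, start)] -= 1      # v + (-v) = 0
            else:
                total[(start, end)] += 1
    return {v for v, n in total.items() if n > 0}

# Two horizontally adjacent pixels share one edge, which cancels.
pair = boundary({(0, 0), (1, 0)})
print(len(pair))
```

Only the vectors separating the set from its complement survive, which is
why interior edges never need to be stored.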

The  algorithm  (PARTITION)  and  description  of  Appendix  A  of  this 
report  give  the  method  by  which  one  can  partition  an  arbitrary  M  X  M 
matrix  into  elementary  regions,  each  of  which  is  represented  by  its 
boundary,  as  described  above.  The  structure  could  be  considered  as  a 
triple 

S  =  (DATA, TOPOLOGY, OPERATION) 


A. Data

Given the original data $\mathcal{O}$ (the set of all the points of the
array considered as singletons), $\mathcal{O}$ determines $\mathcal{U}$,
the power set of $\mathcal{O}$ (the set of all subsets of $\mathcal{O}$).
The domain of the structure data is the set $\mathcal{C}$ of all connected
elements of $\mathcal{U}$. The algorithm PARTITION yields a subset
$\mathcal{H}$ of $\mathcal{C}$, which is defined as the set of all the
homogeneous elements of $\mathcal{C}$ determined by the equivalence
relation (1) (see Fig. 4). Furthermore, there is also a function
$\mathcal{E}$ on $\mathcal{O}$ onto $\mathcal{H}$ which is defined by

$$\mathcal{E} : o \mapsto h$$

where h is the unique element of $\mathcal{H}$ such that $o \subset h$.

The representation of each region in $\mathcal{C}$ is by its boundary,
which is a list of its components, the closed curves of the boundary
(a region may be multiply-connected[8]). Each curve is, in turn, a list
of vectors, ordered as they are geometrically (see Fig. 5).

B.  Topology 

Implicit in this structure is a great deal of easily extractable
topological (and indeed metric) information. In particular, we may
consider the matrix P as embedded in $E^2$, the Euclidean plane. Thus,
we can easily extract information such as what is to the left or right
of a vector.

FIGURE 4 THE DATA LATTICE

FIGURE 5 A TYPICAL BOUNDARY OF VECTORS (a) AND ITS CORRESPONDING
LIST (b)


Thus, given the vector $V_k(P(I,J)) = ((X_1,Y_1),(X_2,Y_2))$, the point
$P = (X_p,Y_p)$ to the right can be computed directly from these endpoint
coordinates.


Knowing what is to the right or left of a vector makes finding
the neighbors (i.e., regions whose boundaries are tangent to the boundary
of a given region) trivial by virtue of the function $\mathcal{E}$.
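
As a hedged illustration of the left/right lookup, the sketch below
assumes the coordinate convention of Appendix A (picture elements at odd
coordinates, vector endpoints at even coordinates, so each boundary vector
spans two coordinate units) and a y axis that points upward. The function
names and the `region_of` dictionary are hypothetical; this is one
reconstruction consistent with the text, not necessarily the authors'
formula.

```python
def _midpoint_and_direction(vector):
    """Return the midpoint and the unit direction of an axis-aligned vector."""
    (x1, y1), (x2, y2) = vector
    mid = ((x1 + x2) // 2, (y1 + y2) // 2)
    direction = ((x2 - x1) // 2, (y2 - y1) // 2)   # one of (+-1, 0), (0, +-1)
    return mid, direction

def right_of(vector):
    """Picture point on the right-hand side of the vector."""
    (mx, my), (dx, dy) = _midpoint_and_direction(vector)
    return (mx + dy, my - dx)      # direction rotated 90 degrees clockwise

def left_of(vector):
    """Picture point on the left-hand side of the vector."""
    (mx, my), (dx, dy) = _midpoint_and_direction(vector)
    return (mx - dy, my + dx)      # direction rotated 90 degrees counterclockwise

def neighbors(boundary_curves, region_of):
    """Names of the regions lying to the right of a region's boundary.

    region_of is a hypothetical dict from picture point to region name;
    points outside the picture are simply skipped.
    """
    return {region_of[right_of(v)]
            for curve in boundary_curves for v in curve
            if right_of(v) in region_of}

# Counterclockwise boundary of the single picture point at (1, 1).
square = [((0, 0), (2, 0)), ((2, 0), (2, 2)), ((2, 2), (0, 2)), ((0, 2), (0, 0))]
print([right_of(v) for v in square])
```

Because the region always lies to the left of its own boundary vectors,
every lookup on the right returns a point of some other region.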
Equally trivial is the question of whether a point is inside
or outside a boundary, or any question as to the connectivity of the
region (multiply-connected).

C.  Operations 

In order to be able to modify the structure as well as describe
and retrieve information, an operation named MERGE was defined on
disjoint pairs in $\mathcal{C}$ of Section A that joins the two regions,
creates the new boundaries, and saves the local properties of each
element. It may happen that we wish to cut a region into two by a curve
(as determined perhaps by some higher level information), but it turns
out that MERGE will suffice here once we define a negative boundary,
which is just

$$-\partial(P(I,J)) = (-V_1) + (-V_2) + (-V_3) + (-V_4)$$

$$-\partial(X) = \sum_{p \in X} -\partial(p)$$

Now, to cut a region S, we find X such that MERGE(-X, S) and X are the
desired regions. We note that, by definition, under the operation MERGE,
the data is closed, context-independent, syntactically descriptive, and
flexible.
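
One way to read the cutting rule, offered as an interpretation rather
than as the authors' derivation: if X is a subset of the region S, then
S is the disjoint union of S \ X and X, and the additivity of the
boundary over disjoint sets (Section II) gives the identity below. The
set-difference notation is introduced here only for illustration.

```latex
\partial(S) = \partial(S \setminus X) + \partial(X)
\quad\Longrightarrow\quad
\partial(S \setminus X) = \partial(S) + \bigl(-\partial(X)\bigr)
```

Under this reading, merging the negative of X into S leaves exactly the
boundary of the remaining piece, so no separate cutting operation is
required.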

III THE PROBLEM DOMAIN

One can think of the problem domain as being the possible set of
nodes for the description tree. Each node of the tree represents a region
of $\mathcal{C}$, and associated with it is a summary of the description
of the region it represents.

Trivially, one could consider all possible nodes of the tree subject
to the constraints of the property function $\mathcal{D}$ and build them
up as they make sense, but this is combinatorial. So, a more realistic
way is to use proper heuristics to guide the path so that most merges are
constructive, in the sense that the new nodes constructed at each step
tend (in some sense) to be closer to the goal. We give an example in
scene analysis.

The Stanford Research Institute robot lives in a limited environment
in which the class of interesting scenes contains only cubes, wedges,
and background. This class of scenes does not illustrate the more
powerful aspects of the above structure, but it is the only example used
so far. It will suffice as an illustration.

We let the description function $\mathcal{D}$ be

$$\mathcal{D}(P(I,J)) = \text{GRAYSCALE OF } P(I,J)$$

and we partition according to the above relation (1). The elementary
regions are then regions of uniform grayscale.

Then, using the result of this pass and starting from one region,
$R_1$, we grow the region by merging those neighbors that satisfy the
following criterion:

If $R_n$ is a neighbor of $R_1$, and if we put

$$I = \sum_{v \in J} w_v$$

where

$$w_v = \begin{cases} 1 & \text{if the gray scales of the points to the right and to the left of } v \text{ differ by 1} \\ 0 & \text{otherwise} \end{cases}$$

and J is the part of the boundary of $R_1$ tangent to $R_n$, and if P is
the smaller of the perimeters of $R_1$ and $R_n$, then the condition is
satisfied if $I/P > \theta$, where $\theta$ is a threshold (we have used
0.5).

We continue to grow until this criterion is no longer satisfied by any
neighbor of $R_1$. The whole picture or any portion of the picture can
be processed in this fashion.
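
The merging test described above can be sketched as follows. This is a
minimal illustration under assumptions: each vector of the common
boundary is represented by the pair of picture points to its left and
right, perimeters are counted in unit vectors, and the function and
parameter names are hypothetical.

```python
def should_merge(common_boundary, gray, perimeter_a, perimeter_b, theta=0.5):
    """Return True when the weak part of the common boundary is long enough.

    common_boundary: list of (left_point, right_point) pairs, one per vector
                     of the boundary shared by the two regions (assumed form).
    gray:            dict mapping a picture point to its gray level.
    perimeter_a/b:   perimeters of the two regions, in unit vectors.
    theta:           threshold; the note reports using 0.5.
    """
    # A vector is "weak" when the gray levels on its two sides differ by 1.
    weak = sum(1 for left, right in common_boundary
               if abs(gray[left] - gray[right]) == 1)
    return weak / min(perimeter_a, perimeter_b) > theta

gray = {(1, 1): 5, (3, 1): 6, (1, 3): 5, (3, 3): 9, (1, 5): 5, (3, 5): 4}
shared = [((1, 1), (3, 1)), ((1, 3), (3, 3)), ((1, 5), (3, 5))]
print(should_merge(shared, gray, perimeter_a=8, perimeter_b=3))
```

Dividing by the smaller perimeter means that a small region is absorbed
when most of its outline is weak, even if the shared boundary is a small
fraction of the larger region's outline.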

The  result  of  this  heuristic  (see  Fig.  6)  is  that  we  are  ready  to 
attempt  a  description  of  the  scene. 
IV CONCLUSION


This type of representation does not claim great efficiency in
coding or storing pictorial information. However, it is a structure
that provides the ease of manipulation and topological richness
necessary for picture-handling and description; it has the added
advantage of being completely general.
FIGURE 6 A TYPICAL PICTURE PROCESSED WITHIN THE STRUCTURE
OF THE TEXT
(a) The picture partitioned into its elementary regions
(b) One of the large regions of the picture together with the
    neighbors that satisfy the merging criterion
(c) Merge of all the regions that satisfy the merging condition
    (after several iterations)
(d) Four regions such as in (c) displayed together
(e) The regions of (d) in their background
(f) Picture completely processed in terms of the merging heuristic
Appendix A


CONNECTED  COMPONENTS  ALGORITHM 


1. Purpose of the Algorithm

The algorithm described below partitions an arbitrary (M x N) picture
array into homogeneous connected components and represents each
such region by the curves that make up its boundary.

2. Logical Description

The algorithm to find the connected components works on an array
(N x N) of picture elements P(I,J), each of which points to the list of
its properties (intensity, color, texture, ...). A two-dimensional
Cartesian coordinate system is assumed where the elements of the picture
stand in the positions defined by odd coordinates. The positions with
even coordinates are the positions of the endpoints of the vectors used
to separate picture elements (Fig. A-1).


Connectivity is defined by assuming that each point has four neighbors
(Fig. A-1) and that two points, P(I,J) and P(K,L), are connected if
there exists a sequence of points

$$P(I,J) = S_1, S_2, \ldots, S_N = P(K,L)$$

such that, for each n = 2, ..., N, $S_{n-1} \sim S_n$ and $S_{n-1}$ and
$S_n$ are neighbors. (~ is the equivalence relation between the elements
of the picture.)

Given the equivalence relation ~, this definition of connectivity,
and an array of picture elements, the algorithm finds the connected
components homogeneous with respect to ~ (regions) and their boundaries
(the closed curves bounding the region).

Each  picture  element  of  the  original  array  is  replaced  by  the  name 
of  the  region  to  which  it  belongs.  The  name  of  the  region  points  to  the 
list  of  the  common  properties  of  this  region  and  to  the  boundary  of  that 
region. 




In general, a boundary separates two or more regions; parts of the
boundary or even the whole boundary of a given region are also parts of
the boundary of another region. The simple representation used here,
which avoids the orientation problem and multiply-branching nodes, is to
create for each region its own boundary, so that two contiguous regions
do not share parts of the same boundary (Fig. A-1).

The  boundary  is  computed  as  an  ordered  list  of  vectors  of  unit  length 
such  that,  for  each  vector  of  the  boundary  of  a  given  region,  the  region 
is  on  the  left  side  of  this  vector,  and  such  that  any  vector  of  the  list 
is  followed  in  the  list  by  the  vector  that  follows  it  geometrically.  Any 
vector  is  connected  at  least  to  one  and  at  most  to  two  other  vectors. 

The  algorithm  itself  applies  the  L-shaped  window  of  Fig.  A-2  to  each 
point  of  the  picture,  scanning  in  a  raster  and  performing  what  is  best 
described  by  the  flow  chart  of  Fig.  A-3. 

To  process  the  edges  corresponding  to  the  frame  of  the  picture,  the 
picture  is  expanded  by  one  element  on  all  sides  to  a  picture  of  (N  +  2)  X 
(N  +  2)  elements,  by  adding  a  region  with  such  values  of  the  properties 
that  the  equivalence  is  never  true  with  any  element  of  the  original  picture. 

Since any path between two points of the picture satisfying the
definition of connectivity could be generated by this type of window
applied to every point of the picture, it follows that this algorithm
will give the connected components of the picture. However, it is
possible that one region could first be recognized as two or more
different regions at the beginning of the process (see Fig. A-4); they
will be joined further on under the same name. The partial results
computed (boundaries) will also be joined.
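
The raster scan with later joining of provisional names can be sketched
as below. This is not the flow chart of Fig. A-3: it assumes that the
window compares each point with its upper and left neighbors, it uses a
union-find table (a technique swapped in here for the joining step), and
it tracks only region names, not boundaries. All identifiers are
hypothetical.

```python
def label_regions(picture, same=lambda a, b: a == b):
    """Assign a region name to every element of a 2-D list (4-connectivity)."""
    rows, cols = len(picture), len(picture[0])
    parent = []                                  # union-find over provisional names

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]        # path halving
            x = parent[x]
        return x

    labels = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            here = picture[r][c]
            up = labels[r - 1][c] if r > 0 and same(here, picture[r - 1][c]) else None
            left = labels[r][c - 1] if c > 0 and same(here, picture[r][c - 1]) else None
            if up is None and left is None:
                parent.append(len(parent))       # start a new provisional region
                labels[r][c] = len(parent) - 1
            elif up is not None and left is not None:
                a, b = find(up), find(left)
                parent[max(a, b)] = min(a, b)    # two names turn out to be one region
                labels[r][c] = min(a, b)
            else:
                labels[r][c] = up if up is not None else left
    # Replace every provisional name by the final name of its region.
    return [[find(x) for x in row] for row in labels]

# A U-shaped region of 1s: its two arms get different names until the
# bottom row is scanned, as in the situation of Fig. A-4.
picture = [[1, 0, 1],
           [1, 0, 1],
           [1, 1, 1]]
print(label_regions(picture))
```

The `same` argument plays the role of the equivalence relation, so other
properties than gray level could be compared without changing the scan.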

A vector created by the window is added to the list of vectors
representing a component of the boundary (a boundary can have more than
one component if there are holes in the region) if and only if one of
the endpoints of this vector is the same as one of the free endpoints
of the list; otherwise, this vector begins a new component. Then, when a
component has been changed by adding a vector or a list of vectors, the
connection of this component with the other components of the boundary
is tried. It is possible that two originally separated components will
be joined later (Fig. A-4).

FIGURE A-1 A SAMPLE OF A PICTURE ARRAY WITH THE BOUNDARY VECTORS
surrounding a connected component, namely the O's

FIGURE A-2 THE WINDOW FOR THE PARTITION ALGORITHM

3. Pictures

This algorithm was developed for scene analysis, to describe the
picture in a "region-oriented sense" and to provide a structure
flexible enough to modify regions by joining them together. The
algorithm was tested on grayscale pictures of different sizes (from
120 x 120 to 60 x 60) of geometrical objects, with 16 levels of gray.
The equivalence relation was that the grayscale of two contiguous
elements should be equal.
FIGURE A-3 FLOW CHART OF THE CONNECTED COMPONENTS ALGORITHM

It is planned to use the algorithm in another kind of environment for
pictures, taking into account different properties such as color, the
range of each point for stereo pictures, and a kind of texture
measurement attached to every point.

4. Implementation

At present, this algorithm is implemented at SRI in LISP 1.5 on the
Artificial Intelligence Group's SDS-940 as a study in picture-processing.
LISP was chosen for this study because it was the most natural available
language in which to do this type of processing, since there are numerous
lists constructed that by nature are of indefinite length, and the
structure of the language itself lends itself to the type of structure
we have described.

In  this  implementation  we  represent  the  original  picture  as  an  array, 
each  element  of  which  is  a  pointer  to  a  list  of  properties  (a  property 
list).  The  function  partition  then  has  this  array  as  data  and  returns 
the  above  structure.  The  array  elements  now  point  to  the  name  of  the 
elementary  region  to  which  they  belong  (an  atom  created  for  this  purpose). 
This  atom  then  has  on  its  property  list  the  properties  common  to  all  the 
points  of  that  region,  and  the  value  of  that  atom  is  the  boundary  of  the 
region.  This  boundary  is  a  list  of  its  components,  which  are  in  turn 
lists  of  the  vectors  of  which  they  are  composed. 
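
For readers unfamiliar with LISP atoms and property lists, the layout
just described can be mimicked in Python. This is an analogy only, not a
transcription of the SRI code: the class, the dictionaries, and the
helper names are hypothetical, and the sample region name and property
values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[int, int]
Vector = Tuple[Point, Point]

@dataclass
class Region:
    # Analogue of the atom's property list: properties common to all points.
    properties: Dict[str, object] = field(default_factory=dict)
    # Analogue of the atom's value: a list of components (closed curves),
    # each an ordered list of vectors.
    boundary: List[List[Vector]] = field(default_factory=list)

# Region names play the role of the atoms created by the partition step.
regions: Dict[str, Region] = {
    "R1": Region(
        properties={"COLOR": "RED", "INTENSITY": 5},
        boundary=[[((0, 0), (2, 0)), ((2, 0), (2, 2)),
                   ((2, 2), (0, 2)), ((0, 2), (0, 0))]],
    )
}

# After partitioning, each array element holds the name of its region.
array: Dict[Point, str] = {(1, 1): "R1"}

def properties_of(point: Point) -> Dict[str, object]:
    """Follow point -> region name -> property list."""
    return regions[array[point]].properties

def boundary_of(point: Point) -> List[List[Vector]]:
    """Follow point -> region name -> boundary (list of closed curves)."""
    return regions[array[point]].boundary

print(properties_of((1, 1))["COLOR"], len(boundary_of((1, 1))[0]))
```

Merging two regions then amounts to creating one new name, pointing the
affected array elements at it, and combining the two boundaries.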

The functions $\mathcal{D}$ and $\mathcal{E}$ are made up of GETP and
SETA, respectively (see Fig. A-5).

The disadvantage of LISP on the SDS-940 is that it is very slow.
This inefficiency could be reduced by the use of some special language,
or, perhaps better still, by LISP on a LISP-oriented rather than a
FORTRAN-oriented machine (that is, one with a large core and a proper
set of machine instructions).

FIGURE A-4 A PARTIALLY PROCESSED PICTURE ($R_1$ and $R_3$ are first
thought of as separate regions. It is not until all of $R_2$ is
processed that $R_1$ and $R_3$ are joined.)

    ARRAY ID --> RK --> BRK = ((V1 ... Vn) ...)
                 RK --> (COLOR RED INTENSITY 5 ...)

FIGURE A-5 A DIAGRAM OF THE IMPLEMENTED STRUCTURE IN LISP AT SRI.
The function $\mathcal{D}$ is composed from the LISP function GETP, and
$\partial$ is essentially EVAL.

One plan to speed up computation (whatever the machine) is to use a
zoom technique. One takes an N x N picture and reduces the resolution to,
say, 1/4 N or even further, and processes the coarse picture first to
find the section of interest. Once the desired information is extracted,
ambiguities are resolved as much as possible by increasing the
resolution locally. It has been found that often the figures of a
picture can be determined at a much reduced resolution and that the high
resolution is needed only locally.
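
A minimal sketch of the zoom idea follows. The note does not say how the
resolution is reduced; block averaging is assumed here purely for
illustration, and the function names are hypothetical.

```python
def reduce_resolution(picture, factor):
    """Coarsen a 2-D list by averaging factor x factor blocks (integer mean)."""
    rows = len(picture) // factor
    cols = len(picture[0]) // factor
    return [[sum(picture[r * factor + dr][c * factor + dc]
                 for dr in range(factor) for dc in range(factor))
             // (factor * factor)
             for c in range(cols)]
            for r in range(rows)]

def full_resolution_block(picture, r, c, factor):
    """Return the original elements behind coarse cell (r, c)."""
    return [row[c * factor:(c + 1) * factor]
            for row in picture[r * factor:(r + 1) * factor]]

picture = [[0, 0, 8, 8],
           [0, 0, 8, 8],
           [4, 4, 2, 2],
           [4, 4, 2, 6]]
coarse = reduce_resolution(picture, 2)           # process this first
detail = full_resolution_block(picture, 1, 1, 2) # zoom in where it is ambiguous
print(coarse, detail)
```

The coarse picture can be partitioned with the same algorithm, and only
the blocks whose interpretation remains ambiguous need to be revisited at
full resolution.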